Building new dictionaries
~~~~~~~~~~~~~~~~~~~~~~~~~
by Jesper Skov
Making new dictionaries is not very hard, but you'll have to interpret the
Makefile yourself.
Below is a complete example where I rebuild the English and Danish
dictionaries after re-compiling the ispell programs with the MASKBITS
variable set to 128.
[First the fix8bit tool is compiled - you only need it for certain
languages, e.g. Danish, as we have funky letters :) ]
>cd languages/
>make fix8bit
+ gcc -O2 -DAMIGA -Iinclude: -o fix8bit fix8bit.c
[Then the English dictionary is built. It consists of multiple word lists, so
I use sort to merge them into a single word list. You can control which
sublists are included, and thereby the size and "power" of the dictionary.
See the Makefile for some predefined dictionary sizes.]
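The merge step can be sketched in Python for readers who want to see what the sort invocation in the session below does. This is only an illustration under my reading of the options: `-t/` splits each entry at the first slash (word before, affix flags after), `+0f -1` sorts on the word with case folded, `+0` breaks ties on the whole line, and `-u` drops exact duplicates. The function name `merge_wordlists` and the sample entries are hypothetical; they are not part of ispell.

```python
def merge_wordlists(*wordlists):
    """Merge word lists into one sorted list without duplicate entries.

    Each word list is an iterable of lines such as "apple/S", where the
    part after the slash holds the affix flags.
    """
    unique = set()
    for lines in wordlists:
        for line in lines:
            entry = line.strip()
            if entry:  # skip blank lines
                unique.add(entry)

    def sort_key(entry):
        word = entry.split("/", 1)[0]
        # Primary key: the word, case folded. Secondary key: the whole line.
        return (word.upper(), entry)

    return sorted(unique, key=sort_key)


# Hypothetical sample data standing in for english.0, american.0, etc.
list_a = ["apple/S", "Banana", "cherry/MS"]
list_b = ["banana", "apple/S", "date"]
merged = merge_wordlists(list_a, list_b)
print("\n".join(merged))
```

The duplicate "apple/S" is kept once, while "Banana" and "banana" both survive because they differ as whole lines.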
>cd english/
>dir
-----rw-d 4 1769 Jan 23 1995 altamer.0
-----rw-d 1 402 Nov 2 1994 altamer.1
-----rw-d 2 856 Nov 2 1994 altamer.2
-----rw-d 18 8831 Jan 23 1995 american.0
-----rw-d 9 4410 Jan 23 1995 american.1
-----rw-d 80 40591 Jan 23 1995 american.2
-----rw-d 19 9477 Jan 23 1995 british.0
-----rw-d 9 4500 Jan 23 1995 british.1
-----rw-d 81 41194 Jan 23 1995 british.2
-----rw-d 364 186058 Jan 23 1995 english.0
-----rw-d 270 137937 Jan 23 1995 english.1
-----rw-d 618 316348 Jan 23 1995 english.2
-----rw-d 338 172832 Jan 23 1995 english.3
-----rw-d 14 6916 Jan 25 1994 english.4l
-----rw-d 12 5688 Jan 23 1995 english.aff
-----rw-d 35 17536 Nov 2 1994 Makefile
----arwed 27 13670 Jan 23 1995 msgs.h
Dirs:0 Files:17 Blocks:1901 Bytes:969015
>bin:sort -u -t/ +0f -1 +0 -o english.med english.0 american.0 altamer.0 british.0 english.1 american.1 altamer.1 british.1
>dir
-----rw-d 4 1769 Jan 23 1995 altamer.0
-----rw-d 1 402 Nov 2 1994 altamer.1
-----rw-d 2 856 Nov 2 1994 altamer.2
-----rw-d 18 8831 Jan 23 1995 american.0
-----rw-d 9 4410 Jan 23 1995 american.1
-----rw-d 80 40591 Jan 23 1995 american.2
-----rw-d 19 9477 Jan 23 1995 british.0
-----rw-d 9 4500 Jan 23 1995 british.1
-----rw-d 81 41194 Jan 23 1995 british.2
-----rw-d 364 186058 Jan 23 1995 english.0
-----rw-d 270 137937 Jan 23 1995 english.1
-----rw-d 618 316348 Jan 23 1995 english.2
-----rw-d 338 172832 Jan 23 1995 english.3
-----rw-d 14 6916 Jan 25 1994 english.4l
-----rw-d 12 5688 Jan 23 1995 english.aff
-----rwed 688 351911 Sep 14 15:57 english.med
-----rw-d 35 17536 Nov 2 1994 Makefile
----arwed 27 13670 Jan 23 1995 msgs.h
Dirs:0 Files:18 Blocks:2589 Bytes:1320926
>buildhash english.med english.aff english.hash
Counting words in dictionary ...
1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 14000 15000 16000
17000 18000 19000 20000 21000 22000 23000 24000 25000 26000 27000 28000 29000 30000 31000 32000
32433 words
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 14000 15000 16000
17000 18000 19000 20000 21000 22000 23000 24000 25000 26000 27000 28000 29000 30000 31000 32000
>dir
-----rw-d 4 1769 Jan 23 1995 altamer.0
-----rw-d 1 402 Nov 2 1994 altamer.1
-----rw-d 2 856 Nov 2 1994 altamer.2
-----rw-d 18 8831 Jan 23 1995 american.0
-----rw-d 9 4410 Jan 23 1995 american.1
-----rw-d 80 40591 Jan 23 1995 american.2
-----rw-d 19 9477 Jan 23 1995 british.0
-----rw-d 9 4500 Jan 23 1995 british.1
-----rw-d 81 41194 Jan 23 1995 british.2
-----rw-d 364 186058 Jan 23 1995 english.0
-----rwed 1 6 Sep 14 15:58 english.0.cnt
-----rwed 5 2106 Sep 14 15:58 english.0.stat
-----rw-d 270 137937 Jan 23 1995 english.1
-----rw-d 618 316348 Jan 23 1995 english.2
-----rw-d 338 172832 Jan 23 1995 english.3
-----rw-d 14 6916 Jan 25 1994 english.4l
-----rw-d 12 5688 Jan 23 1995 english.aff
-----rwed 2255 1154482 Sep 21 14:20 english.hash
-----rwed 688 351911 Sep 14 15:57 english.med
-----rwed 1 6 Sep 21 14:20 english.med.cnt
-----rwed 5 2107 Sep 21 14:20 english.med.stat
-----rw-d 35 17536 Nov 2 1994 Makefile
----arwed 27 13670 Jan 23 1995 msgs.h
Dirs:0 Files:23 Blocks:4856 Bytes:2479633
>copy english.aff english.hash english.med english.med.cnt english.med.stat ispell:lib
>cd /
[Now rebuild the Danish dictionary. There is only one word list, so sort is
not needed. The fix8bit tool is used to convert the 7-bit affix file to 8-bit
characters.
BTW: the word list can be found at one of the sites suggested in
languages/Where. It is not part of the Ispell distribution.]
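To make the 7-bit to 8-bit step less of a black box, here is a small Python sketch of that kind of conversion. It rests on an assumption: that the 7-bit affix file writes each high (non-ASCII) character as a four-character `\xNN` hex escape. Please check fix8bit.c before relying on this. The function names `to_8bit` and `to_7bit` are hypothetical. The assumption does at least fit the session below, where dansk.aff comes out 150 bytes smaller than dansk.7bit, which is what you would see if 50 four-character escapes each shrank to one byte.

```python
import re

# ASSUMPTION: high characters are written as \xNN (two hex digits) in the
# 7-bit file. This escape form is not confirmed by the document.
_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def to_8bit(data: bytes) -> bytes:
    """Replace every \\xNN escape with the single byte it names."""
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), data)


def to_7bit(data: bytes) -> bytes:
    """Replace every byte >= 0x80 with a \\xNN escape."""
    return b"".join(
        b"\\x%02x" % byte if byte >= 0x80 else bytes([byte])
        for byte in data
    )


# Hypothetical affix-file line: 0xE6 and 0xC6 are Latin-1 "ae" ligatures.
sample = b"wordchars \\xe6 \\xc6\n"
converted = to_8bit(sample)
print(len(sample), len(converted))
```

Each escape that is converted saves three bytes, so comparing the file sizes before and after gives a quick sanity check on how many characters were touched.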
>cd dansk/
>dir
-----rw-d 11 5464 Jan 23 1995 dansk.7bit
-----rw-d 632 323386 Jun 29 19:53 dansk.med
-----rw-d 9 4594 Nov 2 1994 Makefile
Dirs:0 Files:4 Blocks:663 Bytes:338758
>../fix8bit -8 < dansk.7bit > dansk.aff
>dh3:ispell-3.1.18Work/buildhash dansk.med dansk.aff dansk.hash
Counting words in dictionary ...
1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 14000 15000 16000
17000 18000 19000 20000 21000 22000 23000 24000 25000 26000 27000
27606 words
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 14000 15000 16000
17000 18000 19000 20000 21000 22000 23000 24000 25000 26000 27000
>dir
-----rw-d 11 5464 Jan 23 1995 dansk.7bit
-----rwed 11 5314 Sep 21 14:40 dansk.aff
-----rwed 2091 1070528 Sep 21 14:41 dansk.hash
-----rw-d 632 323386 Jun 29 19:53 dansk.med
-----rwed 1 6 Sep 21 14:41 dansk.med.cnt
-----rwed 5 2106 Sep 21 14:41 dansk.med.stat
-----rw-d 9 4594 Nov 2 1994 Makefile
Dirs:0 Files:7 Blocks:2760 Bytes:1411398
>copy dansk.aff dansk.hash dansk.med dansk.med.cnt dansk.med.stat ispell:lib
dansk.aff..copied.
dansk.hash..copied.
dansk.med..copied.
dansk.med.cnt..copied.
dansk.med.stat..copied.
>
That's it. I hope this little document will make it easier for you to build
dictionaries. If there are any "bugs" in this doc, please inform me thereof!
/Jesper